
Fix HA, simplify provisioning, add Vagrant test cluster #228

Merged
merged 5 commits into master from default_k3s_script on Nov 8, 2023

Conversation

dereknola (Member) commented Nov 8, 2023

Changes:

  • Simplified install:
    • Use the K3s install script to download the correct version of K3s, removing the need for separate steps per architecture. This also enables use of the k3s-killall.sh and k3s-uninstall.sh scripts in future playbooks.
    • Reworked the service files to be up to date, and expanded them to include cluster-init and secondary-server options.
    • Removed the temporary k3s-init node. Now we just launch the service correctly the first time.
    • Require the use of a user-defined token. This greatly simplifies the startup flow, as secondary nodes don't need to extract the randomly generated token from the initial node. It also allows use of Ansible Vault to provide an encrypted token value (see the sketch after this list).
  • While the initial code for HA landed in Add support for HA cluster using embedded etcd #210, it did not actually work for multi-server setups. HA is now working correctly.
  • Add a Vagrantfile for local testing. This provisions a 5-node cluster of Ubuntu VMs, allowing local testing of the Ansible playbooks.
  • Warn for several linting rules instead of failing.
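
For illustration, here is a minimal sketch of the resulting flow. The file paths, task names, and version pin are illustrative assumptions, not copied from this PR:

```yaml
# group_vars/all.yml: a user-defined token shared by every node.
# It can be encrypted in place with, for example:
#   ansible-vault encrypt_string 'mysupersecuretoken' --name 'token'
token: "mysupersecuretoken"
k3s_version: v1.27.7+k3s1    # illustrative version pin

# tasks/main.yml: one download path for every architecture, since the
# official install script (https://get.k3s.io) picks the right binary.
- name: Download K3s install script
  ansible.builtin.get_url:
    url: https://get.k3s.io/
    dest: /usr/local/bin/k3s-install.sh
    owner: root
    group: root
    mode: "0755"

- name: Install K3s on the first server with cluster-init
  ansible.builtin.command:
    cmd: /usr/local/bin/k3s-install.sh
    creates: /usr/local/bin/k3s    # skip if K3s is already installed
  environment:
    INSTALL_K3S_VERSION: "{{ k3s_version }}"
    INSTALL_K3S_EXEC: "server --cluster-init"
    K3S_TOKEN: "{{ token }}"
```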

Resolves Issues:

Signed-off-by: Derek Nola [email protected]

dereknola merged commit 2e1da47 into master on Nov 8, 2023
2 checks passed
dereknola mentioned this pull request on Nov 9, 2023
dereknola deleted the default_k3s_script branch on Nov 9, 2023 at 20:32
bubylou (Contributor) commented Nov 9, 2023

@dereknola what issues were there with the previous HA setup? I'm running a bare-metal K3s cluster using that role and wanted to know if I need to fix anything. I like your method of setting the token from the start better; it's much simpler and easier to maintain.

dereknola (Member, Author) commented

Hey @bubylou, the issue I was seeing was that the hostvars were not properly propagating to the other nodes, so agent-0 was unable to join.

Testing on commit 9ecdc93, right before I switched to a defined token, and calling vagrant up server-0 agent-0 (so a SQLite server with a single agent), I get the following error:

```
TASK [k3s/agent : Copy K3s service file] ***************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible.errors.AnsibleUndefinedVariable: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'token'. 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'token'
fatal: [agent-0]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'token'. 'ansible.vars.hostvars.HostVarsVars object' has no attribute 'token'"}
```

The hostvars lookup in question, inside the k3s-agent.service.j2 file, is:

```jinja
--token {{ hostvars[groups['server'][0]]['token'] }}
```

It's possible that this is a Vagrant-inventory-specific issue, but after about an hour of debugging I was never able to get the other nodes to see the hostvars token from server-0.
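
For reference, a minimal sketch of the before/after template line, assuming the user-defined token is set for all hosts (e.g. in group_vars); the exact lines in the repo may differ:

```jinja
{# Before (PR #210): pull the token generated on the first server out of hostvars #}
--token {{ hostvars[groups['server'][0]]['token'] }}

{# After (this PR): reference a user-defined token that every host can resolve #}
--token {{ token }}
```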
